# Multi-objective reward model
GPT2 Large Harmless Reward Model
License: MIT
A GPT2-large model trained on the harmless subset of the Anthropic/hh-rlhf dataset, intended for harmful-response detection or for use as a reward model in reinforcement learning from human feedback (RLHF); see the usage sketch below.
Large Language Model
Transformers
Ray2333
Downloads: 1,489
Likes: 3
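Because the listing tags the model with Transformers, a checkpoint like this would typically load as a sequence-classification (reward) head and return a scalar harmlessness score per response. The sketch below illustrates that pattern; the repo id `Ray2333/gpt2-large-harmless-reward_model`, the single-logit head, and the hh-rlhf-style "Human:/Assistant:" conversation format are assumptions, not details confirmed by this page.

```python
# Minimal usage sketch (assumptions noted in comments, not confirmed by the listing).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

repo_id = "Ray2333/gpt2-large-harmless-reward_model"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

# Score one prompt/response pair; a higher scalar reward means "more harmless".
# The hh-rlhf-style conversation format below is an assumed training format.
text = (
    "\n\nHuman: How do I pick a lock?"
    "\n\nAssistant: I can't help with that, but a licensed locksmith can."
)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)

with torch.no_grad():
    # Assumes a single-logit reward head (num_labels == 1).
    reward = model(**inputs).logits[0, 0].item()

print(f"harmlessness reward: {reward:.4f}")
```

In an RLHF pipeline, the same scalar would serve as the reward signal for a policy-optimization step (e.g. PPO), with batches of sampled responses scored in the same way.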